ESTIMATION IN THE NEGATIVE BINOMIAL DISTRIBUTION

SUMMARY 


Practical simplified procedures are developed in this paper for
calculating estimates of the parameters of the negative binomial
distribution with probability function

$$f(x) = \frac{\Gamma(x + k)}{x!\,\Gamma(k)}\, p^k (1 - p)^x ; \qquad x = 0, 1, 2, \ldots$$

where 0 < p < 1 and k > 0. Moment estimators, maximum likelihood
estimators, and estimators based on moments and frequencies in
selected classes are given both for the complete and for the
truncated (with missing zero class) distribution. To facilitate
calculation of the various estimators given, a table of the function
$-p \ln p/(1 - p)$, with entries to six decimals at intervals of 0.001
for the argument p, is included. Illustrative examples are also
included.


I. INTRODUCTION 


The negative binomial distribution is used extensively for the
description of data that are too heterogeneous to be fitted by a
Poisson distribution. Since much of the data collected in studies
of atmospheric phenomena exhibit marked heterogeneity, this
distribution is of particular interest to aerospace scientists. It has
been considered by numerous investigators, among whom are Greenwood
and Yule [6], Fisher [5], Haldane [7], Anscombe [1], and Bliss and
Fisher [2]. Samples from this distribution when zero observations
are missing have been studied by David and Johnson [4], Sampford [10],
Rider [9], Hartley [8], and by Brass [3]. This paper is primarily
concerned with estimation in the truncated negative binomial
distribution from which the zero observations are missing. Consideration
is also given to estimation in this distribution when there is no
truncation. Tables of the function $-p \ln p/(1 - p)$, which are
useful in calculating estimates in both cases, are included.


II. THE PROBABILITY FUNCTION AND ITS MOMENTS 


The probability function of the negative binomial distribution
may be written as

$$f(x) = \frac{\Gamma(x + k)}{x!\,\Gamma(k)}\, p^k (1 - p)^x ; \qquad x = 0, 1, 2, \ldots, \tag{1}$$

so that $f(0) = p^k$. The form in which this function was considered
by Fisher [5] follows from (1) when we make the transformation
q = (1 - p)/p. A form considered by Anscombe [1] follows upon
making the transformation m = k(1 - p)/p = kq. For the purposes
of this paper, the form given in (1) is considered preferable.

When the zero observations are removed, the probability function
for the resulting truncated distribution becomes

$$f_T(x) = \frac{\Gamma(k + x)}{x!\,\Gamma(k)} \cdot \frac{p^k (1 - p)^x}{1 - p^k} ; \qquad x = 1, 2, 3, \ldots \tag{2}$$


The factorial moments of the truncated distribution are

$$\mu_{[j]} = \frac{\Gamma(k + j)}{\Gamma(k)\,(1 - p^k)} \left(\frac{1 - p}{p}\right)^{j}. \tag{3}$$


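From (3) and from (2) it follows that

$$\begin{aligned}
\mu &= \mu_{[1]} = \frac{k}{1 - p^k} \left(\frac{1 - p}{p}\right), \\
\mu_{[2]} &= \frac{k(k + 1)}{1 - p^k} \left(\frac{1 - p}{p}\right)^{2} = \mu (k + 1) \left(\frac{1 - p}{p}\right), \\
f_T(1) &= P = \frac{k p^k (1 - p)}{1 - p^k} = \mu p^{k+1}.
\end{aligned} \tag{4}$$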
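The three relations in (4) can be checked numerically by summing the truncated probability function (2) directly. The following is a minimal sketch, not part of the original memorandum; the function names and the parameter values k = 0.5, p = 0.2 are illustrative assumptions.

```python
import math

def nb_pmf(x, k, p):
    """Probability f(x) of equation (1), computed on the log scale."""
    log_f = (math.lgamma(x + k) - math.lgamma(x + 1) - math.lgamma(k)
             + k * math.log(p) + x * math.log(1.0 - p))
    return math.exp(log_f)

def truncated_pmf(x, k, p):
    """Zero-truncated probability f_T(x) of equation (2), for x >= 1."""
    return nb_pmf(x, k, p) / (1.0 - p ** k)

k, p = 0.5, 0.2          # illustrative values only, not taken from the paper
xs = range(1, 2000)      # terms beyond this range are negligible for p = 0.2

# Moments obtained by direct summation of the truncated distribution.
mean_sum = sum(x * truncated_pmf(x, k, p) for x in xs)
fact2_sum = sum(x * (x - 1) * truncated_pmf(x, k, p) for x in xs)

# Closed forms from equation (4).
mu = k * (1.0 - p) / (p * (1.0 - p ** k))   # first equation of (4)
mu_2 = mu * (k + 1.0) * (1.0 - p) / p       # second equation of (4)
P = mu * p ** (k + 1.0)                     # third equation of (4)

print(mean_sum, mu)
print(fact2_sum, mu_2)
print(truncated_pmf(1, k, p), P)
```

Each printed pair should agree to within rounding error, which is a convenient guard against sign or exponent slips when transcribing (3) and (4).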


III. ESTIMATION IN THE TRUNCATED DISTRIBUTION 


Since estimating equations which result from equating the first
two sample moments to corresponding distribution moments do not lead
to explicit solutions, David and Johnson [4] considered explicit
estimators based on the first three sample moments, but found that
they were quite inefficient. Sampford [10] subsequently developed
a reasonably rapid iterative technique for solving the two-moment
estimating equations, but ultimately concluded that the values thereby
obtained could often serve only as first approximations for use in an
iterative solution to the maximum likelihood estimating equations.
Later Brass [3] derived explicit estimators based on the first two
moments and the density of ones, which turned out to be reasonably
efficient for most combinations of distribution parameters. The
Brass estimators follow when the last equation of (4) is solved for
$p^k$ to obtain

$$p^k = P/(\mu p), \tag{5}$$


and this value is substituted in turn into the first and the second
equation of (4). On equating the sample mean and variance to the
distribution mean and variance and the relative frequency of ones
in the sample to $f_T(1) = P$, the Brass estimators become

$$p^* = \frac{\bar{x}}{s^2}\left(1 - \frac{n_1}{n}\right) \quad\text{and}\quad k^* = \frac{p^* \bar{x} - n_1/n}{1 - p^*}, \tag{6}$$

where

$$\bar{x} = \sum_{x=1}^{R} x\, n_x / n, \qquad s^2 = \sum_{x=1}^{R} (x - \bar{x})^2\, n_x / (n - 1), \tag{7}$$

in which $n_x$ is the number of sample observations for which the random
variable X = x, n is the total number of sample observations, and R
is the largest sample observation.
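The Brass estimators (6) are explicit, so they can be computed in a few lines. The sketch below is illustrative only: the function name, the dictionary layout {x: n_x}, and the `unbiased` switch are assumptions introduced here. It is applied to the chromosome-breakage counts of Section V. For those counts the printed values 0.2345 and 0.6040 appear to be reproduced when the variance is computed with divisor n; with the divisor n - 1 of (7) the same formulas give a somewhat smaller p*.

```python
def brass_estimates(freq, unbiased=True):
    """Brass estimators (6) from a frequency table {x: n_x} with x >= 1.

    unbiased=True uses the divisor n - 1 of equation (7);
    unbiased=False uses the divisor n (an assumption made for comparison).
    """
    n = sum(freq.values())
    xbar = sum(x * nx for x, nx in freq.items()) / n
    ss = sum(nx * (x - xbar) ** 2 for x, nx in freq.items())
    s2 = ss / (n - 1 if unbiased else n)
    prop_ones = freq.get(1, 0) / n            # n_1 / n
    p_star = (xbar / s2) * (1.0 - prop_ones)
    k_star = (p_star * xbar - prop_ones) / (1.0 - p_star)
    return p_star, k_star

# Chromosome-breakage counts from Section V (classes with n_x = 0 omitted).
chromosome = {1: 11, 2: 6, 3: 4, 4: 5, 6: 1, 8: 2, 9: 1, 11: 1, 13: 1}

p_n, k_n = brass_estimates(chromosome, unbiased=False)
p_u, k_u = brass_estimates(chromosome, unbiased=True)
print(round(p_n, 4), round(k_n, 4))
print(round(p_u, 4), round(k_u, 4))
```

Comparing the two divisors side by side shows how sensitive p* is to the variance convention in a sample of only 32 observations.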

Alternate estimators based on the first two moments and the
density of ones follow when we take logarithms of the third equation
of (4), solve for (k + 1) to obtain

$$k + 1 = \frac{\ln(P/\mu)}{\ln p}, \tag{8}$$

and subsequently substitute this value into the second equation of (4).
When $\bar{x}$ and $s^2$ are equated to $\mu$ and $\mu_2$, respectively, and $n_1/n$ is
equated to P, the resulting equations become

$$\frac{-p^{**} \ln p^{**}}{1 - p^{**}} = \frac{\bar{x}}{s^2 + \bar{x}^2 - \bar{x}}\, \ln\!\left(\frac{n\bar{x}}{n_1}\right), \qquad k^{**} = \frac{p^{**}}{1 - p^{**}} \left(\frac{s^2 + \bar{x}^2 - \bar{x}}{\bar{x}}\right) - 1. \tag{9}$$

Although the estimator given above for p is not in explicit form,
linear inverse interpolation in the accompanying table of the function
$-p \ln p/(1 - p)$ quickly yields the required estimate to as many as
four decimal places.
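Linear inverse interpolation in the table can be imitated in code by tabulating $-p \ln p/(1 - p)$ at intervals of 0.001 and interpolating between the two consecutive entries that bracket the required value. This is a sketch under that assumption; the names `phi`, `TABLE`, and `inverse_phi` are introduced here for illustration and do not appear in the memorandum.

```python
import math

def phi(p):
    """The tabled function -p ln p / (1 - p), which increases on (0, 1)."""
    return -p * math.log(p) / (1.0 - p)

# Entries at intervals of 0.001 for the argument p, as in the accompanying table.
TABLE = [(i / 1000.0, phi(i / 1000.0)) for i in range(1, 1000)]

def inverse_phi(value):
    """Linear inverse interpolation between two consecutive tabled entries."""
    for (p0, f0), (p1, f1) in zip(TABLE, TABLE[1:]):
        if f0 <= value <= f1:
            return p0 + (p1 - p0) * (value - f0) / (f1 - f0)
    raise ValueError("value lies outside the tabled range")

# Right side of the first equation of (9) for the truncated example of Section V.
p_alt = inverse_phi(0.439814)
print(round(p_alt, 4))
```

Because the function is smooth and the tabular interval is small, the interpolation error is far below the four decimals quoted in the text.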

The estimators given in (9) and the Brass estimators given in (6)
utilize information provided by sample values $\bar{x}$, $s^2$ and $n_1/n$, but the
precise manner in which this information is employed differs in the
two cases. Actually, with only two parameters to be estimated,
sufficient information is provided by $\bar{x}$ and $n_1/n$. As we demonstrate
below, it is unnecessary to use the sample variance $s^2$.

When the expression for $p^k$ given in equation (5) is substituted
into the first equation of (4), we simplify to obtain

$$k = \frac{\mu p - P}{1 - p}, \qquad k + 1 = \frac{(\mu - 1)p + (1 - P)}{1 - p}. \tag{10}$$

On taking logarithms of the third equation of (4) and solving for
(k + 1), we have

$$k + 1 = \frac{\ln(P/\mu)}{\ln p}. \tag{11}$$

On setting the right side of (11) equal to the right side of the second
equation of (10) and letting $\mu = \bar{x}$ and $P = n_1/n$, we obtain the following
estimating equation in p alone.


$$G(p) = \left[(\bar{x} - 1) + \frac{n - n_1}{np}\right] \left[\frac{-p \ln p}{1 - p}\right] = \ln\!\left(\frac{n\bar{x}}{n_1}\right). \tag{12}$$

With the aid of the accompanying table of $-p \ln p/(1 - p)$, it is
relatively simple to solve this equation for the required estimate
p*** using linear interpolation as indicated below. We need only
find two consecutive tabled values of p such that $\ln(n\bar{x}/n_1)$ is in
the interval $(G_i, G_{i+1})$.

      p            G(p)

      p_i          G(p_i)
      p***         ln(n x̄ / n_1)
      p_(i+1)      G(p_(i+1))
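The search for two consecutive tabled arguments that bracket $\ln(n\bar{x}/n_1)$, followed by linear interpolation, can be sketched as follows. The function names are assumptions introduced for illustration, and the estimate of k uses the relation k = (p x̄ - n_1/n)/(1 - p) from the first equation of (10). One point worth noting: G(p) also equals the target at p = n_1/(n x̄), where that relation gives k = 0. In the example used here G falls through the target at that point, so the sketch accepts only an upward crossing.

```python
import math

def phi(p):
    """The tabled function -p ln p / (1 - p)."""
    return -p * math.log(p) / (1.0 - p)

def mean_and_ones_truncated(n, n1, xbar, step=0.001):
    """Solve G(p) = ln(n*xbar/n1) of equation (12) by scanning tabled arguments."""
    target = math.log(n * xbar / n1)

    def G(p):
        return ((xbar - 1.0) + (n - n1) / (n * p)) * phi(p)

    m = round(1.0 / step)
    for i in range(1, m - 1):
        p0, p1 = i * step, (i + 1) * step
        g0, g1 = G(p0), G(p1)
        # Accept only an upward crossing; this skips the root with k = 0.
        if g0 < target <= g1:
            p = p0 + (p1 - p0) * (target - g0) / (g1 - g0)
            k = (p * xbar - n1 / n) / (1.0 - p)
            return p, k
    raise ValueError("no bracketing pair of tabled arguments found")

# Chromosome-breakage sample of Section V: n = 32, n_1 = 11, mean = 110/32.
p3, k3 = mean_and_ones_truncated(n=32, n1=11, xbar=110 / 32)
print(round(p3, 4), round(k3, 4))
```

The upward-crossing rule is a convenience for this example rather than a general guarantee; for other samples the bracket should be inspected before the interpolated value is accepted.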


With p*** thus determined, we employ the first equation of (10),
with $\mu = \bar{x}$ and $P = n_1/n$, to calculate

$$k^{***} = \frac{p^{***} \bar{x} - n_1/n}{1 - p^{***}}. \tag{13}$$


The form of k*** is the same as the Brass estimator given in (6), the 
only difference being that here p*** is a root of estimating equation 
(12), whereas the Brass estimator, p*, is given by the first equation 
of (6). To distinguish between the different estimators considered 
here, a single asterisk denotes the Brass estimators, double asterisks 
denote the alternate estimators of equation (9) and triple asterisks 
denote estimators based on the sample mean and the sample proportion 
of ones. 

Of the three estimators considered, those given by Brass enjoy 
the advantage of being easily calculated. The sampling properties 
of the estimators, based on the sample mean and the observed proportion 
of ones, require further investigation, but they are relatively easy 
to calculate with the aid of the accompanying table of $-p \ln p/(1 - p)$,
and they might be expected to be asymptotically more efficient and 
perhaps less affected by bias than the other two estimators. Certainly, 
any of these estimators would be satisfactory as first approximations 
in an iterative solution of the maximum likelihood estimating equations. 


Maximum Likelihood Estimation 


The likelihood function for a random sample of size n from the
truncated distribution is

$$L = \prod_{i=1}^{n} \frac{\Gamma(x_i + k)}{x_i!\,\Gamma(k)} \cdot \frac{p^k (1 - p)^{x_i}}{1 - p^k}. \tag{14}$$

On taking logarithms of (14), differentiating with respect to p and k
in turn, and equating to zero, we obtain the estimating equations

$$\begin{aligned}
\frac{\partial \ln L}{\partial p} &= \frac{nk}{p} + \frac{nk\,p^{k-1}}{1 - p^k} - \frac{n\bar{x}}{1 - p} = 0, \\
\frac{\partial \ln L}{\partial k} &= n \ln p + \frac{n p^k \ln p}{1 - p^k} + \sum_{x=1}^{R} n_x \sum_{j=1}^{x} (k + j - 1)^{-1} = 0.
\end{aligned} \tag{15}$$


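Following Haldane [7] and Sampford [10], these equations may be more
conveniently rewritten as

$$\begin{aligned}
\frac{k(1 - p)}{p(1 - p^k)} &= \bar{x}, \\
\frac{-p \ln p}{1 - p} &= \frac{k}{n\bar{x}} \sum_{x=1}^{R} (k + x - 1)^{-1} \sum_{i=x}^{R} n_i.
\end{aligned} \tag{16}$$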


It is of interest that the first equation of (16) equates the distribution
mean given by the first equation of (4) to the sample mean, $\bar{x}$. The usual
maximum likelihood iterative procedures can be employed to arrive at
solutions, but by taking advantage of the table of $-p \ln p/(1 - p)$ and
following a procedure of Sampford [10], the computational labor involved
can be greatly reduced. In some instances, the computational labor can
be still further reduced by modifying Sampford's procedure. Begin with
an initial approximation $k_{(1)}$, which might be obtained using any of the
estimators previously discussed. Evaluate the right side of the second
equation of (16) and interpolate in the table of $-p \ln p/(1 - p)$ to
obtain a first approximation $p_{(1)}$. Rewrite the first equation of (16) as

$$H(k, p) = \frac{k(1 - p)}{p(1 - p^k)} - \bar{x} = 0. \tag{17}$$

The problem of solution is now reduced to that of finding two values
$k_{(i)}$ and $k_{(i+1)}$ in a sufficiently narrow interval with $H_{(i)}$ and $H_{(i+1)}$ of
opposite signs. Once such values have been found, the required estimates
follow by linear interpolation as indicated below.

      k            p            H

      k_(i)        p_(i)        H_(i)
      k̂            p̂            0
      k_(i+1)      p_(i+1)      H_(i+1)

The symbol (^) serves to designate estimators obtained by the principle
of maximum likelihood.
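The procedure just described can be sketched in code: for a trial k, the second equation of (16) yields p through the inverse of $-p \ln p/(1 - p)$, and k is then adjusted until H(k, p) of (17) changes sign. In this illustration bisection stands in for both the table lookup and the final linear interpolation, and the bracket (0.1, 5.0) for k is an assumption chosen for the chromosome-breakage sample of Section V, kept away from k = 0 where H also tends to zero. All function names are hypothetical.

```python
import math

def phi(p):
    """The tabled function -p ln p / (1 - p)."""
    return -p * math.log(p) / (1.0 - p)

def inverse_phi(value, tol=1e-12):
    """Invert phi by bisection (a stand-in for interpolating in the table)."""
    lo, hi = 1e-12, 1.0 - 1e-12
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if phi(mid) < value:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def ml_truncated(freq, k_lo, k_hi, tol=1e-10):
    """Solve equations (16) for a zero-truncated sample {x: n_x}, x >= 1."""
    n = sum(freq.values())
    total = sum(x * nx for x, nx in freq.items())      # n * xbar
    xbar = total / n
    R = max(freq)
    # tails[x] = sum of n_i for i >= x, the inner sum in (16).
    tails = {x: sum(nx for i, nx in freq.items() if i >= x)
             for x in range(1, R + 1)}

    def p_given_k(k):
        s = sum(tails[x] / (k + x - 1.0) for x in range(1, R + 1))
        return inverse_phi(k * s / total)              # second equation of (16)

    def H(k):
        p = p_given_k(k)
        return k * (1.0 - p) / (p * (1.0 - p ** k)) - xbar   # equation (17)

    h_lo, h_hi = H(k_lo), H(k_hi)
    if h_lo * h_hi > 0:
        raise ValueError("H has the same sign at both ends of the bracket")
    while k_hi - k_lo > tol:
        mid = 0.5 * (k_lo + k_hi)
        h_mid = H(mid)
        if h_lo * h_mid <= 0:
            k_hi = mid
        else:
            k_lo, h_lo = mid, h_mid
    k = 0.5 * (k_lo + k_hi)
    return p_given_k(k), k

chromosome = {1: 11, 2: 6, 3: 4, 4: 5, 6: 1, 8: 2, 9: 1, 11: 1, 13: 1}
p_hat, k_hat = ml_truncated(chromosome, k_lo=0.1, k_hi=5.0)
print(round(p_hat, 4), round(k_hat, 3))
```

Reducing the problem to a one-dimensional search in k mirrors the tabular layout above, where each trial k carries its own p and H.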


IV. ESTIMATION IN THE COMPLETE DISTRIBUTION 

Although numerous estimators for parameters of the negative 
binomial distribution have been proposed, we shall examine here only 
estimators based on (1) the first two moments, (2) the first moment 
and the proportion of zero readings, (3) the first moment and the 
proportion of ones, and (4) the method of maximum likelihood. 

In the parametric form considered by Anscombe [1], which follows
from equation (1) on setting m = k(1 - p)/p, the mean and second central
moment of the complete negative binomial distribution are, respectively,

$$\mu_1 = m, \qquad \mu_2 = m\left(1 + \frac{m}{k}\right). \tag{18}$$

Moment Estimators.

The usual moment estimators obtained by equating sample moments to
distribution moments then follow as

$$m^* = \bar{x}, \qquad k^* = \frac{\bar{x}^2}{s^2 - \bar{x}}. \tag{19}$$


It has been pointed out by Fisher [2] that $\bar{x}$ is a fully efficient
estimator of m, but that the efficiency of k* is somewhat low for some
combinations of m and k. In general the efficiency of k* is high for
small values of the mean and large values of k. More precise statements
concerning the efficiency of k* are given on page 185 of the 1941 paper
by Fisher [5], and on page 371 of Anscombe's paper [1].


Estimators Based on Mean and Proportion of Zeros.

Anscombe [1] found the efficiency of estimators based on the mean
and the proportion of zeros to be reasonably high for appropriate
combinations of m and k. The higher efficiencies occur with the smaller
values of m and the smaller values of k. The estimating equation for
k in this case is

$$\left(1 + \frac{\bar{x}}{k^{**}}\right)^{-k^{**}} = \frac{n_0}{n}, \tag{20}$$

and the estimator for m is the sample mean as in the usual moment estimators,
previously discussed. Bliss [2] writes this equation in the form

$$k^{**} \ln(1 + \bar{x}/k^{**}) = \ln(n/n_0), \tag{21}$$

and suggests solving for k** by a trial-and-error procedure.

If we adhere to the parametric form of the negative binomial
probability function given here in (1), the estimating equations in
the case under consideration assume a form which permits a rapid and
simple solution by linear inverse interpolation in the table of
$-p \ln p/(1 - p)$. With the sample mean and the sample proportion of zeros
equated to corresponding distribution values, the estimating equations,
when equation (1) is the density, are

$$n_0/n = p^k, \qquad \bar{x} = k(1 - p)/p. \tag{22}$$

On taking logarithms of the first equation of (22) and solving for k,
we have

$$k = \frac{\ln(n_0/n)}{\ln p}. \tag{23}$$

On solving the second equation of (22) for k, we have

$$k = p\bar{x}/(1 - p). \tag{24}$$

Equate the right side of (23) to the right side of (24) and simplify to
obtain

$$\frac{-p \ln p}{1 - p} = \frac{\ln(n/n_0)}{\bar{x}}. \tag{25}$$

In this form, it is a simple matter to evaluate the right side of (25),
and read the estimate p** from the accompanying table of $-p \ln p/(1 - p)$.
With p** thus determined, the corresponding estimate of k follows from
(24) as

$$k^{**} = \frac{p^{**}}{1 - p^{**}}\, \bar{x}, \tag{26}$$

with considerable saving in labor over the computational procedures
otherwise necessary.
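Equations (25) and (26) translate directly into code. In this sketch the table lookup is replaced by a bisection inverse of $-p \ln p/(1 - p)$, which is an assumption made for brevity, and the function names are illustrative. The inputs are the red mite counts of Section V (n = 150, n_0 = 70, Σ x n_x = 172).

```python
import math

def phi(p):
    """The tabled function -p ln p / (1 - p)."""
    return -p * math.log(p) / (1.0 - p)

def inverse_phi(value, tol=1e-12):
    """Invert phi by bisection (a stand-in for reading the table)."""
    lo, hi = 1e-12, 1.0 - 1e-12
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if phi(mid) < value:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def mean_and_zeros(n, n0, xbar):
    """Estimates from the mean and the proportion of zeros, (25) and (26)."""
    p = inverse_phi(math.log(n / n0) / xbar)   # equation (25)
    k = p * xbar / (1.0 - p)                   # equation (26)
    return p, k

p2, k2 = mean_and_zeros(n=150, n0=70, xbar=172 / 150)
print(round(p2, 5), round(k2, 4))
```

Unlike the trial-and-error solution of (21), this route needs only a single inversion of a monotone function.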

Estimators Based on Mean and Proportion of Ones.

Estimators based on the mean and the proportion of ones seem likely
to be preferred over estimators based on the mean and the proportion of
zeros when $n_1 > n_0$. The properties of these estimators are being
investigated further, but on the basis of preliminary studies, their inclusion
here seems warranted. In this case the estimating equations are

$$n_1/n = k p^k (1 - p), \qquad \bar{x} = k(1 - p)/p. \tag{27}$$

Divide the first of these equations by the second, and we have

$$n_1/(n\bar{x}) = p^{k+1}. \tag{28}$$

Take logarithms of (28) and solve for k + 1 to obtain

$$k + 1 = \frac{\ln(n_1/(n\bar{x}))}{\ln p}. \tag{29}$$

Solve the second equation of (27) for (k + 1), and we find

$$k + 1 = \frac{p(\bar{x} - 1) + 1}{1 - p}. \tag{30}$$

On equating the right side of (30) to the right side of (29) and
simplifying, we have

$$\left[\frac{-p \ln p}{1 - p}\right] \left(\bar{x} + \frac{1 - p}{p}\right) = \ln\!\left(\frac{n\bar{x}}{n_1}\right). \tag{31}$$

This equation is only slightly more difficult to solve than (25). With
the aid of the table of $-p \ln p/(1 - p)$, it is quite easy to find
consecutive values of p such that p*** is in the interval $(p_i, p_{i+1})$.
Once p*** has been determined, k*** follows from (30) as

$$k^{***} = \frac{p^{***}}{1 - p^{***}}\, \bar{x}. \tag{32}$$

Note that k*** differs from k** given in (26) only in the substitution
of p*** for p**.
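Equation (31) can be solved in the manner described by evaluating its left side at consecutive tabled arguments and interpolating linearly. The sketch below assumes that the left side falls through the target exactly once over the tabled range, which holds for the red mite sample of Section V but is not claimed in general; the function names are illustrative.

```python
import math

def phi(p):
    """The tabled function -p ln p / (1 - p)."""
    return -p * math.log(p) / (1.0 - p)

def mean_and_ones(n, n1, xbar, step=0.001):
    """Estimates from the mean and the proportion of ones, (31) and (32)."""
    target = math.log(n * xbar / n1)

    def J(p):
        return phi(p) * (xbar + (1.0 - p) / p)    # left side of (31)

    m = round(1.0 / step)
    for i in range(1, m - 1):
        p0, p1 = i * step, (i + 1) * step
        j0, j1 = J(p0), J(p1)
        if j1 <= target <= j0:                    # J decreases through the target
            p = p0 + (p1 - p0) * (j0 - target) / (j0 - j1)
            return p, p * xbar / (1.0 - p)        # equation (32)
    raise ValueError("no bracketing pair of tabled arguments found")

# Red mite sample of Section V: n = 150, n_1 = 38, mean = 172/150.
p3c, k3c = mean_and_ones(n=150, n1=38, xbar=172 / 150)
print(round(p3c, 4), round(k3c, 4))
```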

Maximum Likelihood Estimators.

Although maximum likelihood estimation in the complete negative
binomial distribution has been quite fully discussed by Fisher [5],
Anscombe [1], Bliss [2] and others, applicable estimators are included
here as a matter of convenience. The estimating equations obtained
by equating to zero the partial derivatives of the logarithm of the
likelihood function with respect to p and k are

$$\begin{aligned}
\frac{\partial \ln L}{\partial p} &= \frac{nk}{p} - \frac{n\bar{x}}{1 - p} = 0, \\
\frac{\partial \ln L}{\partial k} &= n \ln p + \sum_{x=1}^{R} n_x \sum_{j=1}^{x} (k + j - 1)^{-1} = 0.
\end{aligned} \tag{33}$$

These equations reduce to

$$\begin{aligned}
\frac{k(1 - p)}{p} &= m = \bar{x}, \\
\frac{-p \ln p}{1 - p} &= \frac{k}{n\bar{x}} \sum_{x=1}^{R} (k + x - 1)^{-1} \sum_{i=x}^{R} n_i.
\end{aligned} \tag{34}$$


Equations (34) can be solved by standard iterative procedures, and here
again the accompanying table of $-p \ln p/(1 - p)$ is useful. As was
indicated in the truncated case, we might begin with a first approximation
$k_{(1)}$ and employ the second equation and the table of $-p \ln p/(1 - p)$ to
obtain a first approximation $p_{(1)}$. Write the first equation of (34) as

$$Q(k, p) = \frac{k(1 - p)}{p} - \bar{x} = 0, \tag{35}$$

and our problem is reduced to finding approximations $k_{(i)}$ and $k_{(i+1)}$
such that $Q_{(i)}$ and $Q_{(i+1)}$ are of opposite signs. Final estimates are
then obtained by linear interpolation as indicated, provided the interval
between $k_{(i)}$ and $k_{(i+1)}$ is sufficiently narrow.

      k            p            Q(k, p) = k(1 - p)/p - x̄

      k_(i)        p_(i)        Q_(i)
      k̂            p̂            0
      k_(i+1)      p_(i+1)      Q_(i+1)

Any of the methods previously described will serve to provide a
satisfactory first approximation $k_{(1)}$.
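The same scheme can be sketched for the complete distribution: each trial k gives p from the second equation of (34), and k is refined until Q(k, p) of (35) changes sign. Bisection replaces the table lookup and the final interpolation here, and the bracket (0.5, 3.0) is an assumption suited to the red mite sample of Section V. The function names are hypothetical.

```python
import math

def phi(p):
    """The tabled function -p ln p / (1 - p)."""
    return -p * math.log(p) / (1.0 - p)

def inverse_phi(value, tol=1e-12):
    """Invert phi by bisection (a stand-in for interpolating in the table)."""
    lo, hi = 1e-12, 1.0 - 1e-12
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if phi(mid) < value:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def ml_complete(freq, k_lo, k_hi, tol=1e-10):
    """Solve equations (34) for a complete sample {x: n_x}, x >= 0."""
    n = sum(freq.values())
    total = sum(x * nx for x, nx in freq.items())      # n * xbar
    xbar = total / n
    R = max(freq)
    # tails[x] = sum of n_i for i >= x, the inner sum in (34).
    tails = {x: sum(nx for i, nx in freq.items() if i >= x)
             for x in range(1, R + 1)}

    def p_given_k(k):
        s = sum(tails[x] / (k + x - 1.0) for x in range(1, R + 1))
        return inverse_phi(k * s / total)              # second equation of (34)

    def Q(k):
        p = p_given_k(k)
        return k * (1.0 - p) / p - xbar                # equation (35)

    q_lo, q_hi = Q(k_lo), Q(k_hi)
    if q_lo * q_hi > 0:
        raise ValueError("Q has the same sign at both ends of the bracket")
    while k_hi - k_lo > tol:
        mid = 0.5 * (k_lo + k_hi)
        q_mid = Q(mid)
        if q_lo * q_mid <= 0:
            k_hi = mid
        else:
            k_lo, q_lo = mid, q_mid
    k = 0.5 * (k_lo + k_hi)
    return p_given_k(k), k

mites = {0: 70, 1: 38, 2: 17, 3: 10, 4: 9, 5: 3, 6: 2, 7: 1}
p_ml, k_ml = ml_complete(mites, k_lo=0.5, k_hi=3.0)
print(round(p_ml, 4), round(k_ml, 4))
```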


V. ILLUSTRATIVE EXAMPLES 

Complete Negative Binomial Distribution. To illustrate the
calculation of estimates in the complete negative binomial distribution,
we consider a sample reported by P. Garman on the counts of red mites on
apple leaves, which was previously examined by Bliss [2]. These data
are given below.

      No. mites per leaf, x         0    1    2    3    4    5    6    7    8+
      No. leaves observed, n_x     70   38   17   10    9    3    2    1    0

For this sample, n = 150, $n_0 = 70$, $n_1 = 38$, $\sum_{x=0}^{7} x\,n_x = 172$,
$\sum_{x=0}^{7} x^2 n_x = 536$, $\bar{x} = 1.14667$, and $s^2 = 2.27365$.


Estimates based on the first two moments follow from equations (19) as

$$m^* = 1.14667, \qquad k^* = \frac{(1.14667)^2}{2.27365 - 1.14667} = 1.16670.$$
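The moment estimates above can be reproduced from the frequency table alone. This is a small sketch with an assumed function name; it uses the divisor n - 1 for the sample variance and refuses to return an estimate when the variance does not exceed the mean, since (19) then has no positive solution for k.

```python
def moment_estimates(freq):
    """Moment estimators (19) from a frequency table {x: n_x}, x >= 0."""
    n = sum(freq.values())
    xbar = sum(x * nx for x, nx in freq.items()) / n
    s2 = sum(nx * (x - xbar) ** 2 for x, nx in freq.items()) / (n - 1)
    if s2 <= xbar:
        raise ValueError("sample variance does not exceed the mean")
    return xbar, xbar ** 2 / (s2 - xbar)

mites = {0: 70, 1: 38, 2: 17, 3: 10, 4: 9, 5: 3, 6: 2, 7: 1}
m_star, k_star = moment_estimates(mites)
print(round(m_star, 5), round(k_star, 5))
```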


Estimates based on the first moment and the proportion of zeros
follow from equation (25), which for the data given here becomes

$$\frac{-p \ln p}{1 - p} = \frac{\ln(150/70)}{1.14667} = 0.66465509.$$

Interpolating from the accompanying table, we have

$$p^{**} = 0.46391,$$

and from equation (26)

$$k^{**} = \frac{0.46391}{0.53609}\,(1.14667) = 0.9923,$$

which agrees with the value more laboriously computed by Bliss without
benefit of the table employed here.

Estimates based on the mean and the proportion of ones follow on
solving equation (31), which for our sample becomes

$$J(p) = \left[\frac{-p \ln p}{1 - p}\right] \left[1.14667 + \frac{1 - p}{p}\right] = \ln\!\left(\frac{172}{38}\right) = 1.50990832.$$

With the aid of the accompanying table, it is quickly established that
0.480 < p*** < 0.481. The final estimate is determined by linear
interpolation as shown below.

      p          J(p)

      0.4810     1.50967381
      0.4808     1.50990832
      0.4800     1.51084732

Accordingly, p*** = 0.4808, and from equation (32)

$$k^{***} = \frac{0.4808}{0.5192}\,(1.14667) = 1.0619.$$

Maximum likelihood estimates can be calculated from equations (34)
with the aid of the table of $-p \ln p/(1 - p)$ or by following the technique
of Bliss and Fisher [2]. To four decimal places, both procedures lead
to the estimate

$$\hat{k} = 1.0246.$$

Calculations based on equations (34) and the table of $-p \ln p/(1 - p)$
are sketched below. As a first approximation to k, we select k** = 0.9923
(based on the mean and the proportion of zeros) which we round off for
ease of calculation to $k_{(1)} = 1.0000$. Accumulating on a desk calculator,
with k = 1, we obtain

$$\sum_{x=1}^{7} (x)^{-1} \sum_{i=x}^{7} n_i = 114.9261904.$$

On substituting this value into the second equation of (34) we calculate

$$\frac{-p \ln p}{1 - p} = \frac{1}{172}\,(114.9261904) = 0.66817552.$$

Interpolating in the table of $-p \ln p/(1 - p)$, we have as a first
approximation to the required estimate of p,

$$p_{(1)} = 0.46829.$$

When these values for $k_{(1)}$ and $p_{(1)}$ are substituted into the first
equation of (34), we have

$$\frac{1(1 - 0.46829)}{0.46829} = 1.13543 < \bar{x} = 1.14667.$$

Since both the moment estimate and the estimate based on the mean
and the proportion of ones exceed our first approximation, it seems
appropriate that our second approximation should be greater than the
first. Accordingly, we select $k_{(2)} = 1.03$. Again accumulating on a
desk calculator, but this time with k = 1.03, we calculate

$$\sum_{x=1}^{7} (x + 0.03)^{-1} \sum_{i=x}^{7} n_i = 112.1650693.$$

As with the previous approximation, this value is substituted into the
right side of the second equation of (34) and on interpolating in the
table of $-p \ln p/(1 - p)$, we find as the second approximation

$$p_{(2)} = 0.47267.$$

Our final estimate, $\hat{k} = 1.0246$, is arrived at from the first equation
of (34) by interpolation as shown below.

      k               k(1 - p)/p

      1.0300          1.14911
      k̂ = 1.0246      1.14667 = x̄
      1.0000          1.13543

To the number of decimals given, the value obtained here for k is in
agreement with that calculated (perhaps more laboriously) by Bliss.

Auxiliary tabulations involved in the above calculations are
included in the following table.

A summary of the various estimates for the particular sample under
consideration is contained in the following table.

                               Estimates

      Parameter   M.L.      Moments   Mean and         Mean and
                                      Freq. of Zeros   Freq. of Ones

      k           1.02459   1.16670   0.9923           1.0619
      m           1.14667   1.14667   1.14667          1.14667
      p           0.47189   0.50433   0.46391          0.4808
      q           1.11915   0.98283   1.15557          1.07983

In this example, estimates based on the mean and the frequency of
zeros are in closest agreement with the maximum likelihood estimates
while the moment estimates differ by the greatest amount.

Truncated Negative Binomial Distribution. To illustrate estimation
in the truncated negative binomial distribution, we consider a sample of
chromosome breakage which was originally presented by Sampford [10].
Data for this sample follow.

      No. breaks, x            1   2   3   4   5   6   7   8   9  10  11  12  13
      No. observations, n_x   11   6   4   5   0   1   0   2   1   0   1   0   1

For these data, n = 32, $n_1 = 11$, $\sum_{x=1}^{13} x\,n_x = 110$,
$\sum_{x=1}^{13} x^2 n_x = 686$, $\bar{x} = 3.4375$, and $s^2 = 9.9315$.


Estimates based on the first two moments as computed by Sampford are 


k = 0.633 and p = 0.2346. 


Brass estimates based on the first two moments and the proportion
of ones follow from equations (6) as

$$p^* = \frac{3.4375}{9.9315}\left(1 - \frac{11}{32}\right) = 0.2345,$$

$$k^* = \frac{0.2345\,(3.4375) - (11/32)}{0.7655} = 0.6040.$$

Alternate estimators based on the first two moments and the
proportion of ones follow from equations (9). For these data, the
first equation of (9) becomes

$$\frac{-p \ln p}{1 - p} = \left[\frac{3.4375}{9.9315 + 3.4375^2 - 3.4375}\right] \ln\!\left(\frac{110}{11}\right) = 0.439814.$$

Inverse linear interpolation in our table yields the required estimate

$$p^{**} = 0.2307.$$

From the second equation of (9), we have

$$k^{**} = \frac{0.2307}{0.7693}\,[5.2363636] - 1 = 0.570.$$


Estimates based only on the mean and the proportion of ones follow
from equations (12) and (13). For these data, equation (12) becomes

$$G(p) = \left[2.4375 + \frac{21}{32p}\right] \left[\frac{-p \ln p}{1 - p}\right] = 2.30258509.$$

With the aid of the table of $-p \ln p/(1 - p)$, we quickly determine that
the required estimate p*** is in the interval (0.202 to 0.203), and we
interpolate for the final estimate as indicated.

      p          G(p)

      0.2030     2.30291889
      0.2025     2.30258509
      0.2020     2.30226898

With p*** = 0.2025, we substitute in equation (13) and compute

$$k^{***} = \frac{0.2025\,(3.4375) - 11/32}{0.7975} = 0.4418.$$

Maximum likelihood estimates can be computed from equations (16)
with the aid of the table of $-p \ln p/(1 - p)$ as described by Sampford [10]
in much the same manner as maximum likelihood estimates were calculated
in this paper for the complete negative binomial distribution.
Alternately, the technique described by Hartley [8] might be used. In either
case the final estimates for the sample under consideration are
$\hat{p} = 0.2113$ and $\hat{k} = 0.493$.

A summary of the various estimates for the sample discussed is
presented in the following table.

                                    Estimates

      Parameter   M.L.     Moments   Brass    Alternate   Mean and
                                                          Prop. Ones

      k           0.493    0.633     0.6040   0.570       0.4418
      m           1.8402   2.0652    1.9717   1.9007      1.7399
      p           0.2113   0.2346    0.2345   0.2307      0.2025
      q           3.7326   3.2626    3.2644   3.3346      3.9383

Attention is invited to the close agreement exhibited here between
the estimates based on the mean and the proportion of ones and the maximum
likelihood estimates, in contrast with the rather wide discrepancies
between the moment estimates and the maximum likelihood estimates.


VI. SOME REMARKS ON RELIABILITY OF ESTIMATES 

Asymptotic variances of estimates in the complete negative binomial
distribution have been given by Anscombe [1], and by Bliss and Fisher [2].
Similar results in the truncated case were given by Sampford [10] and
by Hartley [8]. In the interest of completeness, these results are
presented here without proof.

In the complete negative binomial distribution,

$$V(\bar{x}) = \frac{1}{n}\left(m + \frac{m^2}{k}\right), \qquad V(k^*) \doteq \frac{2k(k + 1)}{n(1 - p)^2}, \tag{36}$$

$$V(k^{**}) \doteq \frac{p^{-k} - 1 - k(1 - p)}{n\,[-\ln p - (1 - p)]^2}, \tag{37}$$

$$V(\hat{k}) \doteq \frac{2k(k + 1)}{n(1 - p)^2} \Bigg/ \left[1 + \frac{4(1 - p)}{3(k + 2)} + \frac{3(1 - p)^2}{(k + 2)(k + 3)}\right]. \tag{38}$$

The variances of k*** and p*** in the complete negative binomial
distribution remain to be determined. In the preceding variances, a
single asterisk (*) denotes a moment estimate, double asterisks (**)
denote an estimate based on the mean and the proportion of zeros, the
circumflex (^) denotes maximum likelihood estimates, while triple
asterisks (***) denote estimates based on the mean and the proportion
of ones.

In the truncated negative binomial distribution, variances and
covariances of the maximum likelihood estimates are obtained in the
usual manner by inverting the information matrix in which the components
are expected values of the quantities

$$\begin{aligned}
-\frac{\partial^2 \ln L}{\partial p^2} &= \frac{nk\,[1 - (k + 1)p^k]}{p^2 (1 - p^k)^2} + \frac{n\bar{x}}{(1 - p)^2}, \\
-\frac{\partial^2 \ln L}{\partial p\,\partial k} &= -\frac{n\,[1 - (1 - k \ln p)\,p^k]}{p\,(1 - p^k)^2}, \\
-\frac{\partial^2 \ln L}{\partial k^2} &= \sum_{x=1}^{R} (k + x - 1)^{-2} \sum_{i=x}^{R} n_i - \frac{n (\ln p)^2 p^k}{(1 - p^k)^2}.
\end{aligned} \tag{39}$$

It is usually satisfactory to use these quantities themselves rather than
their expected values.

Variances of the ordinary moment estimators (based on the first two
moments) are given by Sampford [10], while Brass [3] gave variances of
his estimators. In both of these cases the expressions obtained are
rather complicated and are not repeated here.

For the distribution of red mites on apple leaves, considered in
Section V, it follows from (36) that

$$V(\bar{x}) \doteq \frac{1.14667 + 1.28330}{150} = 0.01620,$$

$$V(k^*) \doteq \frac{2(1.1667)(1.1667 + 1)}{150\,(0.49567)^2} = 0.1372.$$

Accordingly, $s_{\bar{x}} \doteq 0.1273$ and $s_{k^*} \doteq 0.370$.
From equation (38) it follows that

$$V(\hat{k}) \doteq 0.07614, \quad\text{and}\quad s_{\hat{k}} \doteq 0.276.$$
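The variance calculations for the red mite sample can be sketched as below. The function names are assumptions, and the maximum likelihood variance is taken in the three-term form of the bracketed divisor in (38), which is an interpretation of that equation rather than a statement from the source. The inputs are the estimates from the summary table of Section V.

```python
import math

def var_mean(m, k, n):
    """Variance of the sample mean, (m + m^2/k)/n, as used in the example."""
    return (m + m * m / k) / n

def var_k_moments(k, p, n):
    """Asymptotic variance of the moment estimator k*, equation (36)."""
    return 2.0 * k * (k + 1.0) / (n * (1.0 - p) ** 2)

def var_k_ml(k, p, n):
    """Asymptotic variance of the ML estimator, three-term form of (38)."""
    divisor = (1.0
               + 4.0 * (1.0 - p) / (3.0 * (k + 2.0))
               + 3.0 * (1.0 - p) ** 2 / ((k + 2.0) * (k + 3.0)))
    return var_k_moments(k, p, n) / divisor

n = 150
v_mean = var_mean(m=1.14667, k=1.02459, n=n)       # uses the ML estimate of k
v_k_mom = var_k_moments(k=1.16670, p=0.50433, n=n)  # moment estimates
v_k_ml = var_k_ml(k=1.02459, p=0.47189, n=n)        # ML estimates

print(round(v_mean, 5), round(math.sqrt(v_mean), 4))
print(round(v_k_mom, 4), round(math.sqrt(v_k_mom), 3))
print(round(v_k_ml, 4), round(math.sqrt(v_k_ml), 3))
```

The standard errors printed alongside the variances correspond to the quantities reported in the text for the mean, the moment estimate of k, and the maximum likelihood estimate of k.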


In the example of chromosome breakage employed to illustrate
estimation in the truncated negative binomial distribution, Sampford [10]
calculated moment estimate variances and covariances to be

$$V(p^*) \doteq 0.015091, \quad \mathrm{Cov}(p^*, k^*) \doteq 0.08125, \quad\text{and}\quad V(k^*) \doteq 0.4983.$$

Corresponding values for maximum likelihood estimates were found to be

$$V(\hat{p}) \doteq 0.009863, \quad \mathrm{Cov}(\hat{p}, \hat{k}) \doteq 0.04719, \quad\text{and}\quad V(\hat{k}) \doteq 0.2763.$$


ERRATA SHEET


NASA TM X-53372, "Estimation in the Negative Binomial Distribution,"
by A. Clifford Cohen, Jr., Marshall Space Flight Center, December 21,
1965.


Enclosed is an erratum for TM X-53372. This equation (the first
part of equation (9) on page 5) was omitted in the reproduction process.
All recipients of this report should remove the backing from the
enclosed equation and stick the equation on the top of page 5 just
above the first equation appearing on that page.